Czech-English Bilingual Valency Lexicon Online

Authors

  • Eva Fučíková
  • Jan Hajič
  • Jana Šindlerová
Abstract

We describe CzEngVallex, a bilingual Czech-English valency lexicon that aligns verbal valency frames and their arguments. It is based on a parallel Czech-English corpus, the Prague Czech-English Dependency Treebank (PCEDT), in which each occurrence of a verb carries a reference to the underlying Czech and English valency lexicons (PDT-Vallex and EngVallex, respectively). CzEngVallex then pairs the entries (verb senses) of the two lexicons. This allows detailed studies of verb valency and argument structure in translation, as well as a comparison of the approaches to valency in the two languages against the background of the same underlying theory, the Functional Generative Description. The CzEngVallex lexicon is now accessible online, and we also describe the search interface, which supports certain complex queries over the lexicon and gives access to the associated examples of verb sense translations extracted from the PCEDT corpus.

1 The PCEDT parallel corpus and its lexicons

Valency, or verb argument structure, is an important phenomenon both in linguistic studies and in language technology applications, since the verb is considered the core of a clause in (almost) every natural language utterance. Various dictionaries have been built, from Propbank [13] to Framenet [1], and valency lexicons exist for several languages, such as Walenty [16] for Polish, Verbalex [8] and Vallex [9] for Czech, and the Valence Lexicon for a Treebank of German [3]. However, there are no truly multilingual valency dictionaries linked to corpora.

The Prague Czech-English Dependency Treebank (PCEDT 2.0) [4] contains the WSJ part of the Penn Treebank [10] and its manual professional translation into Czech, annotated manually using the tectogrammatical representation [11], which was first used for the Prague Dependency Treebank 2.0 (PDT) [5]. The tectogrammatical representation is in turn based on the Functional Generative Description theory [17].
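The pairing of verb senses and their arguments described in the abstract can be pictured with a small data structure. The following is only a minimal sketch under stated assumptions, not the lexicon's actual format or API: the class names `FramePair` and `BilingualLexicon`, the frame identifiers, and the example argument mappings are all hypothetical and chosen purely for illustration. The argument labels (ACT, PAT, ADDR) follow the functor names used in the Functional Generative Description.

```python
from dataclasses import dataclass, field


@dataclass
class FramePair:
    """One aligned pair of verb senses (valency frames). Hypothetical layout."""
    cs_frame_id: str  # made-up identifier of a frame in the Czech lexicon
    en_frame_id: str  # made-up identifier of a frame in the English lexicon
    # Maps each Czech argument label to the English label it is aligned with.
    arg_map: dict = field(default_factory=dict)


class BilingualLexicon:
    """Toy index over frame pairs, searchable from the Czech side."""

    def __init__(self):
        self._by_cs = {}

    def add(self, pair):
        # A single Czech verb sense may be paired with several English senses.
        self._by_cs.setdefault(pair.cs_frame_id, []).append(pair)

    def translations_of(self, cs_frame_id):
        """Return the English frame ids paired with a Czech frame id."""
        return [p.en_frame_id for p in self._by_cs.get(cs_frame_id, [])]


def mismatched_args(pair):
    """Return the argument alignments whose labels differ across languages."""
    return {cs: en for cs, en in pair.arg_map.items() if cs != en}


# Invented example data; not taken from the real lexicon.
lexicon = BilingualLexicon()
same = FramePair("cs-frame-1", "en-frame-1", {"ACT": "ACT", "PAT": "PAT"})
shifted = FramePair("cs-frame-1", "en-frame-2", {"ACT": "ACT", "ADDR": "PAT"})
lexicon.add(same)
lexicon.add(shifted)

print(lexicon.translations_of("cs-frame-1"))
print(mismatched_args(shifted))
```

Storing the argument mapping on each pair, rather than on the individual frames, reflects the idea that the alignment belongs to a specific translation pairing: the same Czech frame may map its arguments differently depending on which English sense it is paired with.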

Related resources

CzEngVallex: a Bilingual Czech-English Valency Lexicon

This paper introduces a new bilingual Czech-English verbal valency lexicon (called CzEngVallex) representing a relatively large empirical database. It includes 20,835 aligned valency frame pairs (i.e., verb senses which are translations of each other) and their aligned arguments. This new lexicon uses data from the Prague Czech-English Dependency Treebank and also takes advantage of the existin...

Joint search in a bilingual valency lexicon and an annotated corpus

... so I say to you ... search, and you will find ... In this paper and the associated system demo, we present an advanced search system that allows a joint search over a (bilingual) valency lexicon and a correspondingly annotated linked parallel corpus. This search tool has been developed on the basis of the Prague Czech-English Dependency Treebank, but its ideas are applicable in p...

Bilingual English-Czech Valency Lexicon Linked to a Parallel Corpus

This paper presents a resource and the associated annotation process used in a project of interlinking Czech and English verbal translational equivalents based on a parallel, richly annotated dependency treebank containing also valency and semantic roles, namely the Prague Czech-English Dependency Treebank. One of the main aims of this project is to create a high-quality and relatively large em...

Verb Argument Pairing in Czech-English Parallel Treebank

We describe CzEngVallex, a bilingual Czech-English valency lexicon which aligns verbal valency frames and their arguments. It is based on a parallel Czech-English corpus, the Prague Czech-English Dependency Treebank, where for each occurrence of a verb a reference to the underlying Czech and English valency lexicons is explicitly recorded. CzEngVallex lexicon pairs the entries (verb senses) of ...

Towards English-Czech Parallel Valency Lexicon via Treebank Examples

The paper describes an ongoing project of building a bilingual valency lexicon in the framework of Functional Generative Description. The bilingual lexicon is designed as a result of interlinking frames and frame elements of two already existing valency lexicons. First, we give an overall account of the character of the lexicons to be linked, second, the process of frame linking is explained, a...

Publication date: 2015